The subword complexity of smooth words on 

2-letter alphabets 

Yun Bao Huang 
Department of Mathematics 
Hangzhou Normal University 
Xiasha Economic Development Area 
Hangzhou, Zhejiang 310036, China 
huangyunbao@sina.com 
huangyunbao@gmail.com 

2010.12.13 



Abstract. Let $\gamma_{a,b}(n)$ be the number of smooth words of length $n$ over the 
alphabet $\{a, b\}$ with $a < b$. Say that a smooth word $w$ is left fully extendable 
(LFE) if both $aw$ and $bw$ are smooth. In this paper, we prove that for any 
positive number $\xi$ and positive integer $n_0$ such that the proportion of $b$'s is 
larger than $\xi$ for each LFE word of length exceeding $n_0$, there are two constants 
$c_1$ and $c_2$ such that for each positive integer $n$, one has 

\[ c_1 \cdot n^{\frac{\log(2b-1)}{\log(1+(a+b-2)(1-\xi))}} < \gamma_{a,b}(n) < c_2 \cdot n^{\frac{\log(2b-1)}{\log(1+(a+b-2)\xi)}}. \]

In particular, taking $a = 1$ and $b = 2$ in the above inequalities yields Huang 
and Weakley's result. Moreover, for a 2-letter even alphabet $\{a, b\}$, there are two 
suitable constants $c_1, c_2$ such that 

\[ c_1 \cdot n^{\frac{\log(2b-1)}{\log((a+b)/2)}} < \gamma_{a,b}(n) < c_2 \cdot n^{\frac{\log(2b-1)}{\log((a+b)/2)}} \quad \text{for each positive integer } n. \]

Keywords: Derivative; height; smooth word; LFE word. 
1. Introduction 



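The curious Kolakoski sequence $K$, which Kolakoski introduced in [19], is the infinite 
sequence over the alphabet $\Sigma = \{1, 2\}$ which starts with 2 and equals the sequence 
defined by its run lengths: 

\[ K = \underbrace{22}_{2}\,\underbrace{11}_{2}\,\underbrace{2}_{1}\,\underbrace{1}_{1}\,\underbrace{22}_{2}\,\underbrace{1}_{1}\,\underbrace{22}_{2}\,\underbrace{11}_{2}\,\underbrace{2}_{1}\,\underbrace{11}_{2}\,\underbrace{22}_{2} \cdots \]

Here, a run is a maximal subsequence of consecutive identical symbols. The Kolakoski 
sequence $K$ has received remarkable attention [1, 2, 3, ..., 26]. For the state of 
research on the Kolakoski sequence $K$ and related problems before 1996, readers can 
refer to Dekking [12]. 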
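The self-describing property can be checked mechanically. The following Python sketch generates a prefix of $K$ and compares it with its own run lengths; the function names `kolakoski` and `run_lengths` are illustrative choices and do not come from the paper.

```python
from itertools import groupby

def kolakoski(n):
    """First n terms of the Kolakoski sequence over {1, 2} that starts with 2."""
    seq = [2, 2]                      # K[0] = 2 forces the first run to be "22"
    i = 1                             # index of the term that gives the next run length
    while len(seq) < n:
        symbol = 3 - seq[-1]          # consecutive runs alternate between 1 and 2
        seq.extend([symbol] * seq[i])
        i += 1
    return seq[:n]

def run_lengths(seq):
    """Lengths of the maximal runs of equal symbols in seq."""
    return [len(list(group)) for _, group in groupby(seq)]

prefix = kolakoski(40)
lengths = run_lengths(prefix)[:-1]    # the last run may be cut off by the prefix
print(lengths == prefix[:len(lengths)])
```

The last run is dropped before comparing because a finite prefix may end in the middle of a run.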

Keane [17] asked whether the density of 1's in $K$ is 0.5. Chvátal [9] proved that 
the upper density of 1's as well as the upper density of 2's in $K$ is less than 0.500838. 
Steacy [24] studied the structure of the Kolakoski sequence $K$ and obtained some 
conditions which are equivalent to Keane's problem. 

In order to study whether the Kolakoski sequence $K$ is recurrent and/or is closed 
under complement, Dekking [11] introduced the notion of $C^\infty$-words over the alphabet 
$\{1, 2\}$ for the first time and noted that the finite factors of $K$ must be $C^\infty$-words. 
Moreover, he proved that there exists a suitable positive constant $c$ such that 
$c \cdot n^{2.15} < \gamma(n) < n^{7.2}$ and conjectured that there are suitable constants $c_1$ and $c_2$ such that 
$c_1 n^q < P_K(n) < c_2 n^q$, where $\gamma(n)$ denotes the number of $C^\infty$-words of length $n$, $P_K(n)$ 
denotes the number of subwords (factors) of length $n$ which occur in the Kolakoski 
sequence $K$, and $q = (\log 3)/\log(3/2)$. 

Weakley [26] showed that there are positive constants $c_1$ and $c_2$ such that for 
each $n$ satisfying $B(k - 1) + 1 < n < A(k) + 1$ for some $k$, $c_1 n^q < \gamma(n) < c_2 n^q$, 
where $A(k)$, $B(k)$ denote respectively the minimum and the maximal length of FE 
words of height $k$ ([26], Corollary 9). 

Huang and Weakley [15] proved that for any positive number $\phi$ and positive integer 
$N$ satisfying $|u|_2/|u| > \frac{1}{2} - \phi$ for each LDE word $u$ of length exceeding $N$, there are 
two suitable constants $c_1$ and $c_2$ such that 

\[ c_1 n^{\frac{\log 3}{\log((3/2)+\phi+(2/N))}} < \gamma(n) < c_2 n^{\frac{\log 3}{\log((3/2)-\phi)}} \quad \text{for each positive integer } n. \]

With the best value known for $\phi$, and large $N$, this gives 

\[ c_1 n^{2.7087} < \gamma_{1,2}(n) < c_2 n^{2.7102}. \]
A naturally arising question is whether we can establish estimates of the 
subword complexity function of smooth words for other 2-letter alphabets. This 
paper studies the subword complexity function of smooth words for arbitrary 2-letter 
alphabets (Theorem 10). We establish bounds on the minimal and maximal heights 
of smooth words of length $n$ (Lemma 9), the best bounds on the minimal and maximal 
heights of smooth words of length $n$ for 2-letter even alphabets (Lemma 13), and 
good lower and upper bounds of the subword complexity function $\gamma_{a,b}(n)$ for a 2-letter 
even alphabet $\{a, b\}$ (Theorem 14), which give $\gamma_{a,b}(n) \approx c\,n^{\log(2b-1)/\log((a+b)/2)}$, 
where $c$ is a suitable constant. 

The paper is structured as follows. In Section 2, we fix some notation 
and introduce some notions. In Section 3, we give some lemmas which are 
needed to establish the estimates of the complexity function for arbitrary 2-letter 
alphabets. In Section 4, we obtain the lower and upper bounds of the subword 
complexity function of smooth words. In Section 5, we establish good 
lower and upper bounds of the subword complexity function $\gamma_{a,b}(n)$ for 2-letter even 
alphabets. Finally, in Section 6, we end the paper with some concluding remarks. 

2. Definitions and notation 

Let $\Sigma = \{a, b\}$ with $a < b$, where $a, b$ are positive integers, and let $\Sigma^*$ denote the free 
monoid over $\Sigma$ with $\varepsilon$ as the empty word. A finite word over $\Sigma$ is an element of $\Sigma^*$. 
If $w = w_1 w_2 \cdots w_n$, $w_i \in \Sigma$ for $i = 1, 2, \ldots, n$, then $n$ is called the length of the word 
$w$ and is denoted by $|w|$. Let $|w|_\alpha$ be the number of occurrences of $\alpha$ in $w$ for $\alpha \in \Sigma$; 
then $|w| = |w|_a + |w|_b$. 

Given a word $w \in \Sigma^*$, a factor (or subword) $u$ of $w$ is a word $u \in \Sigma^*$ such that 
there exist $x, y \in \Sigma^*$ with $w = xuy$. If $x = \varepsilon$ then $u$ is called a prefix. A run (or 
block) is a maximal factor of the form $u = \alpha^k$, $\alpha \in \Sigma$. Finally, $N$ is the set of positive 
integers and the cardinality of a set $A$ is denoted by $|A|$. 

The reversal (or mirror image) of $u = u_1 u_2 \cdots u_n \in \Sigma^*$ is the word $\tilde{u} = u_n u_{n-1} \cdots u_2 u_1$. 
The complement (or permutation) of $u = u_1 u_2 \cdots u_n \in \Sigma^*$ is the word $\bar{u} = 
\bar{u}_1 \bar{u}_2 \cdots \bar{u}_n$, where $\bar{a} = b$, $\bar{b} = a$. 

Now we generalize the definition of differentiable words, which Dekking first 
introduced in [11], from the alphabet $\{1, 2\}$ to an arbitrary 2-letter alphabet $\{a, b\}$. 

To do so, for $w \in \Sigma^*$, $r(w)$ denotes the number of runs of $w$, $fr(w)$ and $lr(w)$ 
denote the first and last runs of $w$ respectively, and $lfr(w)$ and $llr(w)$ denote the 
lengths of the first and last run of $w$ respectively. For example, if $w = a^2 b^b a^a b^3$, then 
$r(w) = 4$, $fr(w) = a^2$, $lr(w) = b^3$, $lfr(w) = 2$ and $llr(w) = 3$. 

Next we introduce the concept of the closure of a word $w$ over $\Sigma$ in 
order to establish the notion of differentiable word for arbitrary 2-letter alphabets. 

Definition 1. Let $w \in \Sigma^*$ and 

\[ w = \alpha^{t_1} \bar{\alpha}^{t_2} \cdots \beta^{t_k}, \tag{2.1} \]

where $\alpha \in \Sigma$, $\beta = \alpha$ if $2 \nmid k$, or else $\beta = \bar{\alpha}$, and $1 \le t_i \le b$ for $1 \le i \le k$. Let 

\[ \hat{w} = \begin{cases} w, & lfr(w) \le a \text{ and } llr(w) \le a, \\ \alpha^{b-t_1} w, & lfr(w) > a \text{ and } llr(w) \le a, \\ w \beta^{b-t_k}, & lfr(w) \le a \text{ and } llr(w) > a, \\ \alpha^{b-t_1} w \beta^{b-t_k}, & lfr(w) > a \text{ and } llr(w) > a. \end{cases} \]

Then $\hat{w}$ is said to be the closure of the word $w$. 

For example, let $w = 3311133313133311133$, $u = 3313133311$; then $u$ is a factor 
of $w$, and $\hat{w} = 333111333131333111333$, $\hat{u} = 333131333111$. Thus $\hat{u}$ is a factor of $\hat{w}$, 
which also holds in general (see Lemma 3 (1)). 
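The closure example can be replayed in a few lines of Python. This is only a sketch under stated assumptions: words are plain strings, the helper names `runs` and `closure` are hypothetical, the padding rule is the one read off from Definition 1 (an end run longer than $a$ is padded to length $b$), and words consisting of a single run are deliberately not handled.

```python
from itertools import groupby

def runs(w):
    """Split a word (string) into its maximal runs, e.g. '3311' -> ['33', '11']."""
    return ["".join(group) for _, group in groupby(w)]

def closure(w, a, b):
    """Closure in the sense of Definition 1, for a word with at least two runs.

    A first (last) run longer than a is padded on the left (right) to length b.
    """
    rs = runs(w)
    if len(rs) < 2:
        raise ValueError("this sketch only handles words with at least two runs")
    first, last = rs[0], rs[-1]
    left = first[0] * (b - len(first)) if len(first) > a else ""
    right = last[0] * (b - len(last)) if len(last) > a else ""
    return left + w + right

w = "3311133313133311133"   # word over {1, 3}, so a = 1 and b = 3
u = "3313133311"            # a factor of w
print(closure(w, 1, 3), closure(u, 1, 3), closure(u, 1, 3) in closure(w, 1, 3))
```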

Definition 2. Let $w \in \Sigma^*$ be of the form (2.1). If the length of every run of $w$, 
except possibly the first and last runs, is $a$ or $b$, then we say that $w$ 
is differentiable, and its derivative, denoted by $D(w)$, is the word whose $j$th symbol 
equals the length of the $j$th run of $w$, discarding the first and/or the last run if its 
length is less than $b$. 

If $\hat{w}$ is differentiable, then we say that $w$ is closurely differentiable. If a finite word 
$w$ is arbitrarily often closurely differentiable, then we call $w$ a $C^\infty$-word or a smooth 
word over the alphabet $\{a, b\}$, and the set of all smooth words over the alphabet 
$\{a, b\}$ is denoted by $C^\infty_{a,b}$ or $C^\infty$. 

Let $\rho(w) = D(\hat{w})$; then it is clear that $w$ is a smooth word if and only if there is 
a positive integer $k$ such that $\rho^k(w) = \varepsilon$. 

Note that if $b = a + 1$ then $\hat{w} = w$. Thus, $w$ is differentiable if and only if $w$ is 
closurely differentiable, which implies that $w$ is a smooth word if and only if there 
is a positive integer $k$ such that $D^k(w) = \varepsilon$. 






By the definition 2, it is clear that if b — a > 2 and a ^ 2, then a b 1 b a a a b b 1 is 
differentiable but not closurely differentiable. Moreover, $D$ is an operator from $\Sigma^*$ to 
$\Sigma^*$, $r(w) \le |D(w)| + 2$ and 

\[ D(\hat{w}) = \begin{cases} bD(w), & b > lfr(w) > a \text{ and } llr(w) \le a, \\ D(w)b, & b > llr(w) > a \text{ and } lfr(w) \le a, \\ bD(w)b, & b > lfr(w) > a \text{ and } b > llr(w) > a, \\ D(w), & \text{otherwise.} \end{cases} \tag{2.2} \]

From (2.2), it follows that if $w$ is closurely differentiable, then it must be differentiable. 

A word $v$ such that $D(v) = w$ is said to be a primitive of $w$. The two primitives 
of $w$ having minimal length are the shortest primitives of $w$. For example, the one-letter word $b$ has $2b^2$ 
primitives of the form $\bar{\alpha}^i \alpha^b \bar{\alpha}^j$, where $\alpha = a, b$ and $i, j = 0, 1, \ldots, b - 1$, and $a^b$, $b^b$ are the 
shortest primitives. It is easy to see that any word $w \in C^\infty$ has at most $2b^2$ 
primitives, and the difference of the lengths of two primitives of $w$ is at most $2(b - 1)$. 
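The count $2b^2$ for the one-letter word $b$ can be confirmed by enumeration. The sketch below assumes the reading of Definition 2 in which a first or last run is discarded exactly when it is shorter than $b$; the names `derivative` and `primitives_of_b` are illustrative, and the alphabet $\{2, 4\}$ is just a sample choice with single-digit letters.

```python
from itertools import groupby

def derivative(w, a, b):
    """D(w) for a word given as a string; returns kept run lengths or None."""
    lengths = [len(list(g)) for _, g in groupby(w)]
    if any(n not in (a, b) for n in lengths[1:-1]):
        return None                      # an interior run has a forbidden length
    if lengths and max(lengths[0], lengths[-1]) > b:
        return None                      # an end run is longer than b
    if lengths and lengths[-1] < b:
        lengths = lengths[:-1]           # discard a short last run
    if lengths and lengths[0] < b:
        lengths = lengths[1:]            # discard a short first run
    return lengths

def primitives_of_b(a, b):
    """All words y^i x^b y^j with {x, y} = {a, b} and 0 <= i, j <= b - 1."""
    x, y = str(a), str(b)
    words = []
    for big, small in ((x, y), (y, x)):
        for i in range(b):
            for j in range(b):
                words.append(small * i + big * b + small * j)
    return words

a, b = 2, 4
prims = primitives_of_b(a, b)
print(len(prims), min(map(len, prims)), max(map(len, prims)))
```

Every enumerated word has derivative equal to the single letter $b$, the shortest ones have length $b$, and the longest exceed them by $2(b-1)$.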

The height of a smooth word $w$ is the smallest integer $k$ such that $D^{k+1}(w) = \varepsilon$. 
We write $ht(w)$ for the height of $w$. For example, if $w = 3\,2^3 3^3 2^3 3^2 2^2 3^2 2^3 3^3 2^3\,3$, then 
$ht(w) = 3$. 
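The height example can be verified by iterating the derivative. The following is a minimal sketch, assuming words are tuples of integers over $\{2, 3\}$ and that $D$ discards an end run exactly when it is shorter than $b$; since $b = a + 1$ here, the closure plays no role. The names `derivative` and `height` are illustrative.

```python
from itertools import groupby

def derivative(w, a, b):
    """D(w) for a word w (tuple of letters); None if w is not differentiable."""
    lengths = [len(list(g)) for _, g in groupby(w)]
    if any(n not in (a, b) for n in lengths[1:-1]):
        return None                   # an interior run has length other than a or b
    if lengths and max(lengths[0], lengths[-1]) > b:
        return None                   # an end run is longer than b
    if lengths and lengths[-1] < b:
        lengths = lengths[:-1]        # discard a short last run
    if lengths and lengths[0] < b:
        lengths = lengths[1:]         # discard a short first run
    return tuple(lengths)

def height(w, a, b):
    """Smallest k with D^(k+1)(w) empty, for a nonempty word; None if an iterate fails."""
    steps = 0
    while w:
        w = derivative(w, a, b)
        if w is None:
            return None
        steps += 1
    return steps - 1

# The word 3 2^3 3^3 2^3 3^2 2^2 3^2 2^3 3^3 2^3 3, written as (letter, exponent) pairs.
runs = [(3, 1), (2, 3), (3, 3), (2, 3), (3, 2), (2, 2),
        (3, 2), (2, 3), (3, 3), (2, 3), (3, 1)]
w = tuple(s for s, e in runs for _ in range(e))
print(derivative(w, 2, 3), height(w, 2, 3))
```

The successive derivatives are $3^3 2^3 3^3$, $3^3$, $3$ and $\varepsilon$, so four applications of $D$ are needed.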

It follows immediately from Definition 2 that 

(1) $D(\tilde{u}) = \widetilde{D(u)}$, $D(\bar{u}) = D(u)$ for each $u \in \Sigma^*$; 

(2) $w \in C^\infty \iff \tilde{w}, \bar{w} \in C^\infty$. 



3. Some lemmas 

The following Lemmas 3 to 5 reveal the relations among the operators mirror image, 
complement, closure and derivative. 

Lemma 3 ([16], Lemma 5). Let $w$ be a differentiable word and $u$ a factor of $w$. 
Then 

(1) both $\hat{u}$ and $w$ are factors of $\hat{w}$; 

(2) $\hat{\tilde{w}} = \tilde{\hat{w}}$, $\hat{\bar{w}} = \bar{\hat{w}}$; 

(3) $D(u)$ is a factor of $D(w)$; 

(4) If $w$ is closurely differentiable, then both $\rho(u)$ and $D(w)$ are factors of $\rho(w)$, 
and $\rho(\tilde{w}) = \widetilde{\rho(w)}$, $\rho(\bar{w}) = \rho(w)$. 
Proof. (1) The assertion (1) follows from Definition 1 of the closure of a word. 

(2) This follows immediately from the definitions of the closure, complement and 
mirror image of a word $w$ and the definition of the operator $\rho$. 

(3) Since $u$ is a factor of $w$, by Definition 2 of the derivative of a word $w$, we 
see that $D(u)$ is a factor of $D(w)$. 

(4) Since $w$ is closurely differentiable and $\rho(w) = D(\hat{w})$, by the assertion (1), $\hat{u}$ 
and $w$ are both factors of $\hat{w}$. Moreover, by the assertion (3), we see that $D(\hat{u})$ and 
$D(w)$ are factors of $D(\hat{w})$, that is, both $\rho(u)$ and $D(w)$ are factors of $\rho(w)$. Finally, 
by the assertion (2), we have $\rho(\tilde{w}) = D(\hat{\tilde{w}}) = D(\tilde{\hat{w}}) = \widetilde{D(\hat{w})} = \widetilde{\rho(w)}$. Similarly, 
$\rho(\bar{w}) = D(\hat{\bar{w}}) = D(\bar{\hat{w}}) = D(\hat{w}) = \rho(w)$. □ 

From Definitions 1 and 2, the following lemma follows immediately. 

Lemma 4 ([16], Lemma 6). Let $w = w_1 w_2 \cdots w_n$ be a differentiable word with $n > 
a + 1$. 

(1) If $lfr(w) = b$ then $w_1 w$ is not a differentiable word and $D(\bar{w}_1^{\,i} w) = D(w)$ for 
$i \le b - 1$; 

(2) If $lfr(w) < b$ then $D(w_1^{b - lfr(w)} w) = bD(w)$; 

(3) If $lfr(w) \le a$ and $r(w) > 1$ then $D(\bar{w}_1 w_1^{a - lfr(w)} w) = aD(w)$. □ 

Lemma 5 ([16], Lemma 7). (1) Let $w = w_1 w_2 \cdots w_n$ be a smooth word. Then any 
factor of $w$ is also a smooth word; 

(2) Any smooth word $w = w_1 w_2 \cdots w_n$ has both a left and a right smooth extension. 

Proof. (1) Let $w$ be a smooth word and $u$ a factor of $w$. Since $w \in 
C^\infty \iff \rho^k(w) = \varepsilon$ for some positive integer $k$, by Lemma 3 (4) we obtain that $\rho^i(u)$ 
is a factor of $\rho^i(w)$ for any positive integer $i \le k$. Hence $\rho^k(w) = \varepsilon$ implies 
$\rho^k(u) = \varepsilon$, so that $u$ is a smooth word. 

(2) We verify the assertion (2) by induction on $|w|$. Since $D(\tilde{w}) = \widetilde{D(w)}$, we only 
need to verify that $w$ has a left smooth extension. It is clear that if $r(w) \le 1$, where 
$r(w)$ is the number of runs of $w$, then the assertion (2) holds. We proceed to the 
induction step. Assume now that $r(w) \ge 2$ and the assertion (2) holds for smooth 
words shorter than $w$. 

If $lfr(w) \le a$ then by Lemma 4 (2-3), we have $D(\bar{w}_1 w_1^{a - lfr(w)} w) = aD(w)$ and 
$D(w_1^{b - lfr(w)} w) = bD(w)$. Since $|D(w)| < |w|$, the induction hypothesis shows that at least one of $aD(w)$ 
and $bD(w)$ is a smooth word, which means that $w$ has a left smooth extension. 

If $b > lfr(w) > a$, then by $w \in C^\infty$, we obtain that $w_1 w$ is a left smooth extension 
of $w$. 

If $b = lfr(w)$, then by Lemma 4 (1), we see that $\bar{w}_1 w$ is a left smooth extension of 
$w$. □ 

Now we are in a position to generalize the notion of LDE words, which Weakley 
first introduced in [26], from the alphabet $\{1, 2\}$ to arbitrary 2-letter alphabets. 

If $aw$ and $bw$ are both smooth, then the word $w$ is said to be left fully extendable 
(LFE). Clearly, LFE words are closed under complement. For every nonnegative 
integer $k$, let $LF_k$ denote the set of LFE words of length $k$. 

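Let $\gamma_{a,b}(k)$ denote the number of smooth words of length $k$ over the alphabet 
$\{a, b\}$. Following Weakley [26], define the differences of $\gamma_{a,b}$ by $\gamma'_{a,b}(k) = 
\gamma_{a,b}(k + 1) - \gamma_{a,b}(k)$ for each $k \ge 0$. From the definition of LFE words, it immediately 
follows that $\gamma'_{a,b}(k) = |LF_k|$ for each nonnegative integer $k$. Since $\gamma_{a,b}(0) = \gamma'_{a,b}(0) = 1$, 
we have 

\[ \gamma_{a,b}(k) = \gamma_{a,b}(0) + \sum_{i=0}^{k-1} \gamma'_{a,b}(i) = 1 + \sum_{i=0}^{k-1} |LF_i| = 2 + \sum_{i=1}^{k-1} |LF_i|. \tag{3.1} \]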
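The relation between the differences of $\gamma$ and the LFE words can be observed by brute force over the alphabet $\{1, 2\}$, where the closure is trivial and a word is smooth exactly when repeated differentiation reaches the empty word. The sketch below is exponential in the length and only meant for small cases; the names `is_smooth`, `gamma` and `lfe_count` are illustrative.

```python
from itertools import groupby, product

def derivative(w):
    """Derivative over {1, 2}: run lengths, dropping an end run of length 1.

    Returns None when some run is longer than 2, i.e. w is not differentiable.
    """
    lengths = [len(list(g)) for _, g in groupby(w)]
    if any(n > 2 for n in lengths):
        return None
    if lengths and lengths[-1] < 2:
        lengths = lengths[:-1]
    if lengths and lengths[0] < 2:
        lengths = lengths[1:]
    return tuple(lengths)

def is_smooth(w):
    """True if repeated differentiation of w reaches the empty word."""
    while w:
        w = derivative(w)
        if w is None:
            return False
    return True

def gamma(n):
    """Number of smooth words of length n over {1, 2}."""
    return sum(is_smooth(w) for w in product((1, 2), repeat=n))

def lfe_count(n):
    """Number of words w of length n such that both 1w and 2w are smooth."""
    return sum(is_smooth((1,) + w) and is_smooth((2,) + w)
               for w in product((1, 2), repeat=n))

values = [gamma(n) for n in range(11)]
print(values[:6], [lfe_count(k) for k in range(5)])
```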

Lemma 6. Let $w = w_1 w_2 \cdots w_k$ be a smooth word, where $k \in N$. If $w$ is a LFE word 
then $D(w)$ is also a LFE word, and if $k \ge b$ or $r(w) > 1$ then $w = w_1^a w_{a+1} \cdots w_k$, 
where $w_1 \ne w_{a+1}$. 

Proof. Assume that $w$ is a LFE word of length exceeding 0. If $k = |w| < b$ then 
it follows from both $w_1 w$ and $\bar{w}_1 w$ being smooth words that 

\[ w = \underbrace{\alpha^a \bar{\alpha}^a \cdots \beta^a}_{t} \bar{\beta}^{\,j}, \]

where $\alpha \in \Sigma$, $\beta = \alpha$ if $2 \nmid t$, otherwise $\beta = \bar{\alpha}$, $0 \le j, t \le b - 1$, $j + t \ge 1$ and $k = t \cdot a + j$. 
So $D(w) = a^{t-1}$ if $j, t \ge 1$, or else $D(w) = \varepsilon$. So, in view of $t < b$ we see that 
$aD(w)$ and $bD(w)$ are both smooth words, that is, $D(w)$ is a LFE word. 

If $k \ge b$, since $w_1 w$ is a smooth word, we get $lfr(w) < b$, which implies that 
$w = w_1^a w_{a+1} \cdots w_k$ and $w_{a+1} \ne w_1$ by $\bar{w}_1 w \in C^\infty$. Moreover, since each smooth 
word has a left smooth extension (Lemma 5 (2)), from $w_1 w, \bar{w}_1 w \in C^\infty$ it follows that 
$aD(w)\ (= D(\bar{w}_1 w))$ and $bD(w)\ (= D(w_1^{b-a} w))$ are both smooth words, that is, $D(w)$ 
is a LFE word. □ 

Let $LF$ denote the set $\bigcup_{i=0}^{\infty} LF_i$ and $P(A) = \{u \in LF : |u| > 0 \text{ and } D(u) \in A\}$ for 
$A \subseteq \Sigma^*$. We now give the number of elements contained in $P^j(\varepsilon)$ for $j \in N$. 

Lemma 7. $|P^j(\varepsilon)| = 4(b - 1)(2b - 1)^{j-1}$ for $j \in N$. 

Proof. By Lemma 6 and the definition of $P(A)$, we see that $P^{j+1}(\varepsilon)$ is exactly 
composed of all LFE primitives of the words in $P^j(\varepsilon)$. 

For each LFE word of the form $\alpha \ldots b$ there are exactly $2b$ LFE primitives: 

^A/(a...6)y, 

where $\alpha, \beta \in \Sigma$, $j = 0, 1, \ldots, b - 1$; $\gamma = \beta$ if $2 \mid |\alpha \ldots b|$, or else $\gamma = \bar{\beta}$. 

For each LFE word of the form $\alpha \ldots a$ there are exactly $2(b - 1)$ LFE primitives: 

rA- 1 («...a) 7 ^, 

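where $\alpha, \beta \in \Sigma$, $j = 1, \ldots, b - 1$; $\gamma = \beta$ if $2 \mid |\alpha \ldots a|$, or else $\gamma = \bar{\beta}$. 

In addition, because $\overline{\alpha \ldots b} = \bar{\alpha} \ldots a$ and LFE words are closed under complement, the LFE words of 
the form $\alpha \ldots b$ and those of the form $\alpha \ldots a$ are equal in number among all LFE words of the same height. It 
follows that 

\[ |P^j(\varepsilon)| = 2b \cdot \tfrac{1}{2} |P^{j-1}(\varepsilon)| + 2(b-1) \cdot \tfrac{1}{2} |P^{j-1}(\varepsilon)| = (2b - 1)|P^{j-1}(\varepsilon)| \quad \text{for } j \ge 2, \]

which implies that 

\[ |P^j(\varepsilon)| = (2b - 1)^{j-1} |P(\varepsilon)|. \tag{3.2} \]

Since the primitives of $\varepsilon$ are of the form $\alpha^i \bar{\alpha}^j$, where $0 \le i, j \le b - 1$ and $i + j \ge 1$, 
by $\bar{\alpha}(\alpha^i \bar{\alpha}^j) \in C^\infty$ we get that if $i \ge 1$ and $j \ge 1$ then $i = a$ and $j = 1, 2, \ldots, b - 1$; 
if $i \ge 1$ and $j = 0$ then $\alpha^{i+1}, \bar{\alpha}\alpha^i \in C^\infty$, which implies $1 \le i \le b - 1$. Thus $\varepsilon$ has 
exactly $4(b - 1)$ LFE primitives, and (3.2) gives the desired result. □ 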

Lemma 8. Let $\xi$ be a positive real number and $n_0$ a positive integer such that 

\[ |u|_b/|u| > \xi \text{ for every LFE word } u \text{ of length exceeding } n_0. \tag{3.3} \]

Then 

(1) $|D(w)| < \alpha|w|$ for each LFE word $w$ with $|w| > N_0$, where $\alpha = 1/(1 + (a + 
b - 2)\xi)$ and $N_0$ is a suitable positive integer. 

(2) $|w| \le \beta|D(w)| + q$ for each LFE word $w$, where $\beta = 1 + (a + b - 2)(1 - \xi)$ and $q$ 
is a suitable positive constant. 

Proof. (1) Since the complement of any smooth word is still a smooth word of the 
same length and $|\bar{u}|_a = |u|_b$, the hypothesis (3.3) of Lemma 8 means that 

\[ |u|_a/|u| > \xi \text{ for every LFE word } u \text{ with } |u| > n_0. \tag{3.4} \]

It is easy to see that 

\[ |w| = |D(w)| + (a-1)|D(w)|_a + (b-1)|D(w)|_b + c, \text{ where } 0 \le c \le 2(b-1). \tag{3.5} \]

Recall that $D(w)$ is again a LFE word by Lemma 6. From (3.3) to (3.5), one has $|w| > (1 + (a + b - 2)\xi)|D(w)|$ whenever $|D(w)|_a/|D(w)| > \xi$ and $|D(w)|_b/|D(w)| > \xi$, 
which implies $|D(w)| < \alpha|w|$ for every LFE word $w$ with $|w| > N_0$, where $N_0$ is a 
suitable positive integer such that $|D(w)| > n_0$ as soon as $|w| > N_0$. 

(2) As $|D(w)|_a/|D(w)| + |D(w)|_b/|D(w)| = 1$, from (3.3) and (3.4) one gets 

\[ |D(w)|_a/|D(w)| < 1 - \xi \text{ for each LFE word } w \text{ with } |w| > N_0, \tag{3.6} \]
\[ |D(w)|_b/|D(w)| < 1 - \xi \text{ for each LFE word } w \text{ with } |w| > N_0. \tag{3.7} \]

So, from (3.5) to (3.7) it follows that $|w| \le \beta|D(w)| + 2(b - 1)$ for $|w| > N_0$, which 
means that (2) also holds. □ 
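The length identity (3.5) is easy to see on a concrete word: the main term $|D(w)| + (a-1)|D(w)|_a + (b-1)|D(w)|_b$ is the total length of the runs that $D$ keeps, and $c$ is the total length of the discarded end runs. The following sketch checks this on the word $3\,2^3 3^3 2^3 3^2 2^2 3^2 2^3 3^3 2^3\,3$ over $\{2, 3\}$; the variable names are illustrative.

```python
from itertools import groupby

a, b = 2, 3
runs = [(3, 1), (2, 3), (3, 3), (2, 3), (3, 2), (2, 2),
        (3, 2), (2, 3), (3, 3), (2, 3), (3, 1)]
w = [s for s, e in runs for _ in range(e)]

lengths = [len(list(g)) for _, g in groupby(w)]
# Both end runs have length 1 < b here, so D(w) is the list of interior run lengths.
d = lengths[1:-1]

main_term = len(d) + (a - 1) * d.count(a) + (b - 1) * d.count(b)
c = len(w) - main_term
print(len(w), main_term, c)
```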

The next lemma establishes bounds on the heights of $C^\infty$-words of length $n$, 
which is of independent interest. 

Lemma 9. Let $ht_{\max}(n)$ and $ht_{\min}(n)$ denote respectively the maximal and the 
minimal heights of LFE words of length $n$. Then for any positive number $\xi$ and positive 
integer $n_0$ satisfying $|u|_b/|u| > \xi$ for each LFE word $u$ with $|u| > n_0$, there are two 
suitable constants $t_1$ and $t_2$ such that for every positive integer $n$, one has 

\[ ht_{\min}(n) > \frac{\log n}{\log(1 + (a+b-2)(1-\xi))} + t_1, \tag{3.8} \]

\[ ht_{\max}(n) < \frac{\log n}{\log(1 + (a+b-2)\xi)} + t_2, \tag{3.9} \]

where $t_1 = -\log\left(2(b-1) + \frac{q}{\beta-1}\right)/\log\beta$, and $q$ and $\beta$ are determined by Lemma 8 (2). 

Proof. First, we check (3.9). We have $|D(w)| < |w|$ for each $|w| > 0$, and $|D(w)| < 
\alpha|w|$ for each LFE word $w$ satisfying $|w| > N_0$ by Lemma 8 (1). 

Let $k_0 - 1$ be the greatest height of all LFE words of length at most $N_0$, and let $m_0$ be the 
least positive integer such that every LFE word $w$ with $|w| = m_0$ has height 
no less than $k_0$. Thus for every LFE word $w$ with $|w| \ge m_0$, one gets 

\[ |D^k(w)| < \alpha^{k-k_0}|w| \text{ for } k \ge k_0. \]

Hence 

\[ |D^k(w)| < 1 \text{ as soon as } \alpha^{k-k_0}|w| \le 1 \iff k \ge \log(|w|)/\log(1/\alpha) + k_0, \]

which means that the height $k - 1$ of $w$ is smaller than $\log(|w|)/\log(1 + (a+b-2)\xi) + k_0$. 
Since there are only finitely many LFE words satisfying $|w| < m_0$, there is a suitable 
constant $t_2$ such that (3.9) holds for each LFE word. 

Second, by Lemma 8 (2), one has $|w| \le \beta|D(w)| + q$ for each LFE word $w$, where 
$\beta = 1 + (a + b - 2)(1 - \xi)$ and $q$ is a suitable constant, which means that 

\begin{align*}
|w| &\le \beta^k |D^k(w)| + q\,\frac{\beta^k - 1}{\beta - 1} \\
    &< 2(b-1)\beta^k + \frac{q\beta^k}{\beta - 1} \\
    &= \left(2(b-1) + \frac{q}{\beta-1}\right)\beta^k \\
    &= m\beta^k,
\end{align*}

where $m = 2(b - 1) + q/(\beta - 1)$ and $k$ is the height of $w$. Thus the length $|w|$ of a LFE 
word $w$ of height $k$ is less than $m\beta^k$, and it follows that 

\[ k > (\log|w| - \log m)/\log\beta, \]

which gives the desired lower bound of $ht_{\min}(n)$, where $t_1 = -\log m/\log\beta$. □ 
Remark 1. (1) From (3.3) and (3.4) it immediately follows that the positive real 
number $\xi$ satisfying the condition (3.3) must be smaller than 1/2. 

(2) From the proof of Lemma 8 we easily see that if we replace LFE words 
in Lemma 8 with some infinite subclass of smooth words which is closed under 
complement, then the corresponding result also holds. 

(3) From the proof of Lemma 9 we see that if we replace LFE words in Lemma 9 
with some infinite subclass of smooth words which is closed under both complement 
and the operator $D$, then the corresponding result still holds. 



4. The subword complexity of smooth words 

Now we can establish our main result on the subword complexity function $\gamma_{a,b}(n)$ of 
smooth words over 2-letter alphabets. 

Theorem 10. For any positive real number $\xi$ and positive integer $n_0$ satisfying 
$|u|_b/|u| > \xi$ for every LFE word $u$ with $|u| > n_0$, there exist two suitable constants 
$c_1$ and $c_2$ such that 

\[ c_1 n^{\frac{\log(2b-1)}{\log(1+(a+b-2)(1-\xi))}} < \gamma_{a,b}(n) < c_2 n^{\frac{\log(2b-1)}{\log(1+(a+b-2)\xi)}} \]

for every positive integer $n$. 
Proof. First, from the definition of $ht_{\max}(n)$, one sees that the length of LFE 
words of height larger than $ht_{\max}(n)$ must be larger than $n$. Thus $\bigcup_{i=1}^{n-1} LF_i \subseteq 
\bigcup_{j=1}^{ht_{\max}(n)} P^j(\varepsilon)$. So from (3.1) and Lemma 7, for any $n \in N$, one has 

\begin{align*}
\gamma_{a,b}(n) &= 2 + \sum_{i=1}^{n-1} |LF_i| \\
  &\le 2 + \sum_{j=1}^{ht_{\max}(n)} |P^j(\varepsilon)| \\
  &= 2 + \sum_{j=1}^{ht_{\max}(n)} 4(b-1)(2b-1)^{j-1} \\
  &= 2 \cdot (2b-1)^{ht_{\max}(n)}. \tag{4.1}
\end{align*}

So combining (3.9) and (4.1) yields the desired upper bound of $\gamma_{a,b}(n)$, where $c_2 = 
2(2b - 1)^{t_2}$. 

Second, from the definition of $ht_{\min}(n)$, it follows that the length of all LFE words 
with height no more than $ht_{\min}(n) - 1$ must be less than $n$. Thus, again from 
(3.1) and Lemma 7, for any $n \in N$ one gets 

\begin{align*}
\gamma_{a,b}(n) &= 2 + \sum_{i=1}^{n-1} |LF_i| \\
  &\ge 2 + \sum_{j=1}^{k} |P^j(\varepsilon)| \\
  &= 2 + \sum_{j=1}^{k} 4(b-1)(2b-1)^{j-1} \\
  &= 2 \cdot (2b-1)^{k}, \tag{4.2}
\end{align*}

where $k = ht_{\min}(n) - 1$. Thus, the desired lower bound of $\gamma_{a,b}(n)$ is obtained from 
(3.8) and (4.2), where $c_1 = 2(2b - 1)^{t_1 - 1}$ and $t_1$ is determined by Lemma 9. □ 
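The last equality in (4.1) and (4.2) is a geometric sum: $2 + \sum_{j=1}^{h} 4(b-1)(2b-1)^{j-1} = 2 + 2\bigl((2b-1)^h - 1\bigr) = 2(2b-1)^h$. A short Python check of this identity, with hypothetical helper names, is:

```python
def lfe_count_at_level(j, b):
    """|P^j(eps)| as given by Lemma 7."""
    return 4 * (b - 1) * (2 * b - 1) ** (j - 1)

def partial_sum(h, b):
    """2 + sum_{j=1}^{h} |P^j(eps)|, the quantity appearing in (4.1) and (4.2)."""
    return 2 + sum(lfe_count_at_level(j, b) for j in range(1, h + 1))

for b in (2, 3, 4):
    print(b, [partial_sum(h, b) for h in range(4)], [2 * (2 * b - 1) ** h for h in range(4)])
```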
Remark 2. Theorem 10 indicates that as soon as we can get lower and upper bounds 
on the letter frequencies of LFE words, we correspondingly obtain an estimate 
of the subword complexity function $\gamma_{a,b}(n)$ of smooth words. 

Taking $\Sigma = \{1, 2\}$ in Theorem 10, we obtain 

Corollary 11. For any positive number $\xi$ and positive integer $n_0$ satisfying $|u|_2/|u| > 
\xi$ for each LDE word $u$ with $|u| > n_0$, there exist two suitable constants $c_1$ and $c_2$ such 
that 

\[ c_1 \cdot n^{\frac{\log 3}{\log(2-\xi)}} < \gamma_{1,2}(n) < c_2 \cdot n^{\frac{\log 3}{\log(1+\xi)}} \quad \text{for each } n \in N. \]

It is obvious that Corollary 11 implies the main Theorem 1 in [15]. 
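To get a feeling for how the two exponents in Theorem 10 depend on $\xi$, the sketch below evaluates them numerically. The function name `theorem10_exponents` and the sample values of $\xi$ are illustrative assumptions; no claim is made here about which $\xi$ is actually attainable. At the limiting value $\xi = 1/2$ both exponents collapse to $\log(2b-1)/\log((a+b)/2)$, which for $\{1, 2\}$ is $\log 3/\log(3/2)$.

```python
import math

def theorem10_exponents(a, b, xi):
    """Exponents of n in the lower and upper bounds of Theorem 10."""
    top = math.log(2 * b - 1)
    lower = top / math.log(1 + (a + b - 2) * (1 - xi))
    upper = top / math.log(1 + (a + b - 2) * xi)
    return lower, upper

for xi in (0.5, 0.499, 0.49, 0.45):
    lower, upper = theorem10_exponents(1, 2, xi)
    print(xi, round(lower, 4), round(upper, 4))
```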



5. The subword complexity of smooth words on 2-letter even alphabets 

Lemma 12. If $w$ is a 2-times differentiable finite word over a 2-letter even alphabet 
$\{a, b\}$, then 

(1) $\bigl||w|_a - |w|_b\bigr| \le b$; 

(2) $\frac{1}{2} - \frac{b}{2|w|} \le \frac{|w|_b}{|w|} \le \frac{1}{2} + \frac{b}{2|w|}$; 

(3) $\lim_{|w| \to \infty} \frac{|w|_a}{|w|} = \lim_{|w| \to \infty} \frac{|w|_b}{|w|} = \frac{1}{2}$; 

(4) $p|D(w)| - q_2 \le |w| \le p|D(w)| + q_1$, 

where $q_1 = (p - 1)b + 2(b - 1)$, $q_2 = (p - 1)b$, $p = \frac{a+b}{2}$. 

Proof. It is obvious that (1) $\Rightarrow$ (2) $\Rightarrow$ (3). So we only need to check (1) and (4). 

(1) Since $w \in C^2_{a,b}$, we have $D^2(w) \in \Sigma^*$. Thus $D^2(w) = \alpha^{t_1} \bar{\alpha}^{t_2} \cdots \beta^{t_k}$, where 
$\alpha \in \Sigma$, $t_i \in N$ for $i = 1, 2, \ldots, k$, and if $2 \mid k$ then $\beta = \bar{\alpha}$, otherwise $\beta = \alpha$. It follows 
that 

ti t 2 *fe 

^ ^ 



A^(Z>») = Ti Q fi a ---72 Q 7ff2 S ---f3 a ---7ff/---7f+i; (5-1) 
D(w) = fA^(D\w)H +1 (5.2) 



where < i, j < 6 - 1, 7* e S and if 2 | t m then 7 m+i = 7 m or else 7 m+i = 7 m . 
Note that a and 6 are both even numbers, from (5.1) it immediately follows 

a a p 



A^A^pV))) = 7 71 7 71 ■ ■ - 7 71 7*7* ■ ■ -7 7 " 1 " " ■ (5.3) 

where a, /3, 7, 71, • • • , 7^+1 G S. 
Then (5.3) gives 

\^\^{D\w)))\ a = \^\^{D\w)))\ b = \ (5.4) 
Now from (5.2) one gets 

w = f A^(i}(«j))f 

= e ci Af 1 (7i)A; 1 (A; 1 1 (^ 2 (^)))A; 1 (7^ +1 )^ 2 (5.5) 

where < i, j, c\, c% < b — 1, \i — £ if 2 | i or else p = £, 77 = £ if 2 | (z + j) or else 
rj — £. Note that 

l|A; 1 (7^ C2 | a -|A; 1 (^ +1 )r / -U<6. 
And if 

irAf (TDU^irAf (tDu 

then 

IA^CtU^U^IA; 1 ^)^^. 

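Thus combining (5.4) and (5.5) produces the desired result (1). 

(4) From (1) it immediately follows that 

\[ |w|_{\bar{\alpha}} - b \le |w|_\alpha \le |w|_{\bar{\alpha}} + b \text{ for } \alpha \in \Sigma. \tag{5.6} \]

Since $|w| = |w|_a + |w|_b$, this gives 

\[ \frac{|w| - b}{2} \le |w|_\alpha \le \frac{|w| + b}{2} \text{ for } \alpha \in \Sigma. \tag{5.7} \]

So, combining (3.5) and (5.7) gives the desired result (4). □ 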

From Remark 1 and Lemma 12 (4), we can establish the following useful bounds 
on the heights of smooth words of length $n$ for 2-letter even alphabets. 

Lemma 13. Let $a, b$ be both even numbers. Then there are two constants $t_1, t_2$ such 
that for each positive integer $n$, one has 

\[ ht_{\min}(n) > \frac{\log n}{\log p} + t_1, \tag{5.8} \]

\[ ht_{\max}(n) < \frac{\log n}{\log p} + t_2, \tag{5.9} \]

where 

\[ t_1 = -\frac{\log\left(3b - 2 + \frac{2(b-1)}{p-1}\right)}{\log p}, \qquad t_2 = 2 - \frac{\log\frac{b(p-2)}{p-1}}{\log p}, \qquad p = \frac{a+b}{2}, \]

\[ -2.3347 \approx -\frac{\log 13}{\log 3} \le t_1 < -1, \qquad 0.7944 \approx 2 - \frac{\log 20}{\log 12} \le t_2 \le 2 - \frac{\log 2}{\log 3} \approx 1.36907. \]

Proof. First, from the proof of (3.8) and the right half of Lemma 12 (4), the 
desired lower bound of $ht_{\min}(n)$ follows immediately, where 

\[ t_1 = -\frac{\log\left(3b - 2 + \frac{2(b-1)}{p-1}\right)}{\log p}. \]

Thus 

\[ t_1 < -\frac{\log b}{\log\frac{a+b}{2}} < -1, \]

and if $a = b - 2$ then 

\[ t_1 = -\frac{\log\left(3b - 2 + \frac{2(b-1)}{b-2}\right)}{\log(b-1)}. \]

If $b = 4$ then $a = 2$, which means $t_1 = -\frac{\log 13}{\log 3}$. For $b > 4$, we have 

\[ t_1 \ge -\frac{\ln\left(3b - 2 + \frac{4(b-1)}{b}\right)}{\ln\frac{b+2}{2}} = -\frac{\ln\frac{3b^2 + 2b - 4}{b}}{\ln\frac{b+2}{2}}. \tag{5.10} \]

Let 

\[ g(b) = \ln 3 \,\ln\frac{3b^2+2b-4}{b} - \ln 13 \,\ln\frac{b+2}{2}, \]

then 

\[ g'(b) = \ln 3 \cdot \frac{3b^2+4}{b(3b^2+2b-4)} - \frac{\ln 13}{b+2}. \]

By Maple, we easily see that the roots of the equation $3(\ln 3 - \ln 13)b^3 + (6\ln 3 - 
2\ln 13)b^2 + 4(\ln 3 + \ln 13)b + 8\ln 3 = 0$ are approximately equal to $-1.003$, $-0.894$, $2.229$. 
Hence, since $g'(4) < 0$ and $g'(b)$ is continuous on $[4, +\infty)$, we obtain $g'(b) < 0$ for all 
$b \ge 4$. Therefore $g(b) \le g(4) = 0$, which implies 

\[ \frac{\ln\frac{3b^2+2b-4}{b}}{\ln\frac{b+2}{2}} \le \frac{\log 13}{\log 3}. \]

Then (5.10) gives $t_1 \ge -\frac{\log 13}{\log 3}$. 

Second, we use an argument similar to the proof of (3.8) to obtain the upper 
bound of $ht_{\max}(n)$. Note that if $ht(w) \ge 2$, then $|w| \ge 2b$. From the left half 
of Lemma 12 (4), we get 

\[ |D(w)| < \frac{1}{p}|w| + b. \tag{5.11} \]

Now assume $w$ is a smooth word of length $n$ with height $k$ larger than or equal to 2. 
Since $ht(w) = k \ge 2$, the word $D^{k-2}(w)$ has height 2 and hence length at least $2b$; from (5.11), we arrive at 

\begin{align*}
2b \le |D^{k-2}(w)| &< \frac{1}{p}|D^{k-3}(w)| + b \\
  &< \frac{1}{p^2}|D^{k-4}(w)| + \frac{1}{p}b + b \\
  &< \cdots \\
  &< \frac{1}{p^{k-2}}|w| + \frac{1}{p^{k-3}}b + \cdots + \frac{1}{p}b + b \\
  &< \frac{1}{p^{k-2}}|w| + \frac{1}{1 - p^{-1}}b.
\end{align*}

Thus 

\[ p^{k-2} < \frac{|w|}{r}, \text{ where } r = \frac{p-2}{p-1}\,b, \]

which means 

\[ k < \frac{\log|w|}{\log p} + 2 - \frac{\log r}{\log p}. \tag{5.12} \]

Note that the length $n$ of a smooth word of height 1 is greater than or equal to 
$a + 2 \ge 4$, so 

\[ \frac{\log n}{\log p} + 2 - \frac{\log r}{\log p} \ge 2 - \frac{\log\frac{b(p-2)}{4(p-1)}}{\log p} \ge 1, \]

which means (5.12) holds for every smooth word. Now the desired upper bound (5.9) 
of $ht_{\max}(n)$ follows immediately from (5.12). 
From 

\[ (b - 3)^2 - (a - 1)^2 - 8 \ge 0 \text{ for } b \ge a + 4, \]

we get 

\[ (b - a)(a + b - 4) \ge 2(a + b), \]

\[ 2b - (a + b) \ge \frac{2(a+b)}{a+b-4}, \]

\[ \frac{p-2}{p-1}\,b \ge p, \]

which means 

\[ \frac{\log\frac{b(p-2)}{p-1}}{\log p} \ge 1 > \frac{\log 2}{\log 3} \text{ for } b \ge a + 4. \]

Thus if $b \ge a + 4$ then $t_2 < 2 - \frac{\log 2}{\log 3} \approx 1.36907$. 

If $b = a + 2$ then $p = a + 1$, so 

\[ t_2 = 2 - \frac{\ln\frac{(a-1)(a+2)}{a}}{\ln(a+1)}. \]

Let 

\[ f(a) = \ln 3 \,\ln\frac{(a-1)(a+2)}{a} - \ln 2 \,\ln(a+1), \]

then 

\begin{align*}
f'(a) &= \ln 3 \cdot \frac{a^2+2}{(a-1)(a+2)a} - \frac{\ln 2}{a+1} \\
  &> \ln 3 \left(\frac{a^2+2}{(a-1)(a+2)a} - \frac{1}{a+1}\right) \\
  &= \ln 3 \cdot \frac{4a+2}{(a^2-1)(a+2)a} \\
  &> 0 \text{ for every } a > 1.
\end{align*}

Hence, $f(a) \ge f(2) = 0$ for each $a \ge 2$, that is, 

\[ \frac{\ln\frac{(a-1)(a+2)}{a}}{\ln(a+1)} \ge \frac{\ln 2}{\ln 3} \text{ for each } a \ge 2, \]

which also gives the desired result $t_2 \le 2 - \frac{\log 2}{\log 3}$. 



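Finally, machine computation shows 

\[ t_2 \ge 2 - \frac{\log 20}{\log 12} \text{ for } b \le 58. \tag{5.13} \]

Moreover, in view of $a < b$, we obtain 

\[ t_2 > 2 - \frac{\log b}{\log\frac{b}{2}}. \tag{5.14} \]

And let 

\[ h(b) = 2 - \frac{\ln b}{\ln\frac{b}{2}}, \tag{5.15} \]

then 

\[ h'(b) = \frac{\frac{1}{b}\left(\ln b - \ln\frac{b}{2}\right)}{\left(\ln\frac{b}{2}\right)^2} > 0 \text{ for } b \ge 4, \]

which means 

\[ h(b) \ge h(60) \approx 0.7962 > 2 - \frac{\log 20}{\log 12} \text{ for } b \ge 60. \]

Thus (5.13), (5.14) and (5.15) give the desired lower bound of the constant $t_2$. □ 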
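The numerical bounds on $t_1$ and $t_2$ can be spot-checked by machine. The sketch below assumes the formulas $t_1 = -\log(3b-2+2(b-1)/(p-1))/\log p$ and $t_2 = 2 - \log(b(p-2)/(p-1))/\log p$ with $p = (a+b)/2$, and scans all even pairs $a < b$ up to an arbitrary limit of 200; the helper names are illustrative and the scan is a sanity check, not a proof.

```python
import math

def t1(a, b):
    """Constant t_1 of Lemma 13 for the even alphabet {a, b}."""
    p = (a + b) / 2
    return -math.log(3 * b - 2 + 2 * (b - 1) / (p - 1)) / math.log(p)

def t2(a, b):
    """Constant t_2 of Lemma 13 for the even alphabet {a, b}."""
    p = (a + b) / 2
    return 2 - math.log(b * (p - 2) / (p - 1)) / math.log(p)

def even_pairs(limit):
    """All pairs of even integers 2 <= a < b <= limit."""
    return [(a, b) for b in range(4, limit + 1, 2) for a in range(2, b, 2)]

pairs = even_pairs(200)
print(min(pairs, key=lambda ab: t1(*ab)), min(pairs, key=lambda ab: t2(*ab)),
      max(pairs, key=lambda ab: t2(*ab)))
```

Under these formulas the extreme values are attained at $\{2, 4\}$ for the lower bound of $t_1$ and the upper bound of $t_2$, and at $\{2, 22\}$ for the lower bound of $t_2$.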

Theorem 14. Let $a, b$ be both even numbers. Then there exist two suitable constants 
$c_1, c_2$ such that 

\[ c_1 n^{\frac{\log(2b-1)}{\log(a+b)-\log 2}} < \gamma_{a,b}(n) < c_2 n^{\frac{\log(2b-1)}{\log(a+b)-\log 2}}, \]

where $c_1 = 2(2b - 1)^{t_1 - 1}$, $c_2 = 2(2b - 1)^{t_2}$, and $t_1, t_2$ are determined by Lemma 13. 

Proof. From the proof of Theorem 10 we easily see that (4.1) and (4.2) always hold. 
Thus combining (5.8) and (4.2) gives 

\[ \gamma_{a,b}(n) > c_1 n^{\frac{\log(2b-1)}{\log(a+b)-\log 2}}. \]

Similarly, from (5.9) and (4.1) it follows that 

\[ \gamma_{a,b}(n) < c_2 n^{\frac{\log(2b-1)}{\log(a+b)-\log 2}}. \quad \square \]
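For concreteness, the common exponent in Theorem 14 can be evaluated for a few even alphabets. The function name `even_alphabet_exponent` and the sample alphabets are illustrative choices only.

```python
import math

def even_alphabet_exponent(a, b):
    """Exponent log(2b - 1) / (log(a + b) - log 2) from Theorem 14 (a, b both even)."""
    if a % 2 or b % 2 or not 0 < a < b:
        raise ValueError("expected even integers with 0 < a < b")
    return math.log(2 * b - 1) / (math.log(a + b) - math.log(2))

for a, b in [(2, 4), (2, 6), (4, 6)]:
    print((a, b), round(even_alphabet_exponent(a, b), 4))
```

For $\{2, 4\}$ the exponent is $\log 7/\log 3$.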

6. Concluding remarks 

Extending our ideas and methods to estimate the subword complexity function of smooth words 
over larger alphabets $\Sigma_n$ containing 
$n$ letters, where $n \ge 3$, is an interesting problem. 

For the 3-letter alphabet $\Sigma_3 = \{2, 4, 6\}$, let 

$w_1 = 6\,4^2 2^6 6^6 4^6 6^6 2^6 4^6$, 
$w_2 = 4\,2^6 6^6 4^6 6^6 2^6 4^6$, 
$w_3 = 4\,2^6 6^6 4^6 6^6 2^6$, 
$v_1 = 4^6 2^2 6^2$, 
$v_2 = 2^6 6^6 2^6 6^6 2^6 6^6 4^4 6^2$, 
$v_3 = 2^6 4^6 2^2 6^2$, 
$u_1 = 2^2 6^2 4^6$, 
$u_2 = 4^4 2^2 6^2 2^2 6^2 2^2 6^2 4^6$, 
$u_3 = 2^2 4^2 6^2$, 

then $D(w_1) = 2\,6^6$, $D(w_2) = 6^6$, $D(w_3) = 6^5$, $D(v_1) = 62$, $D(v_2) = 6^6 4$, $D(v_3) = 6^2 2$, 
$D(u_1) = 26$, $D(u_2) = 2^6 6$, $D(u_3) = 2$. We easily see that each of $w_1, w_2$ and $w_3$ has 
only one left smooth extension and $D(w_i)$ has exactly $i$ left smooth extensions for 
$i = 1, 2, 3$; each of $v_1, v_2$ and $v_3$ has exactly two left smooth extensions and $D(v_i)$ has 
exactly $i$ left smooth extensions for $i = 1, 2, 3$; each of $u_1, u_2$ and $u_3$ has exactly three 
left smooth extensions and $D(u_i)$ has exactly $i$ left smooth extensions for $i = 1, 2, 3$. 
Thus for larger alphabets containing at least three letters, estimating the factor 
complexity function of smooth words becomes more complicated than in the case of 
2-letter alphabets. 
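The nine derivatives listed above can be recomputed mechanically. The sketch below makes two assumptions that are not spelled out in the text: the exponents of the example words are read as $w_1 = 6\,4^2 2^6 6^6 4^6 6^6 2^6 4^6$ and so on, and over $\{2, 4, 6\}$ the derivative discards a first or last run exactly when it is shorter than the largest letter 6. The names `expand` and `derivative` are illustrative.

```python
from itertools import groupby

def expand(runs):
    """Turn (letter, exponent) pairs into the word they describe."""
    return [s for s, e in runs for _ in range(e)]

def derivative(w, top):
    """Run lengths of w, discarding an end run that is shorter than top."""
    lengths = [len(list(g)) for _, g in groupby(w)]
    if lengths and lengths[-1] < top:
        lengths = lengths[:-1]
    if lengths and lengths[0] < top:
        lengths = lengths[1:]
    return lengths

words = {
    "w1": [(6, 1), (4, 2), (2, 6), (6, 6), (4, 6), (6, 6), (2, 6), (4, 6)],
    "w2": [(4, 1), (2, 6), (6, 6), (4, 6), (6, 6), (2, 6), (4, 6)],
    "w3": [(4, 1), (2, 6), (6, 6), (4, 6), (6, 6), (2, 6)],
    "v1": [(4, 6), (2, 2), (6, 2)],
    "v2": [(2, 6), (6, 6), (2, 6), (6, 6), (2, 6), (6, 6), (4, 4), (6, 2)],
    "v3": [(2, 6), (4, 6), (2, 2), (6, 2)],
    "u1": [(2, 2), (6, 2), (4, 6)],
    "u2": [(4, 4), (2, 2), (6, 2), (2, 2), (6, 2), (2, 2), (6, 2), (4, 6)],
    "u3": [(2, 2), (4, 2), (6, 2)],
}
derivatives = {name: derivative(expand(runs), 6) for name, runs in words.items()}
for name, d in derivatives.items():
    print(name, d)
```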